Production recommendation

For anything over ~100 URLs, use the Batch API. scrape_many() is fine for small lists where you want a simple list-in / list-out contract — but it runs sync requests client-side with threads, which eats your sync rate limit and can't tolerate crashes.

Multiple URLs (scrape_many)

Use scrape_many() when you have a small list of URLs. It runs the requests concurrently, so it is much faster than a sequential loop.

Basic usage

urls = [
    "https://example.com/page/1",
    "https://example.com/page/2",
    "https://example.com/page/3",
]
results = client.scrape_many(urls, format="markdown")

# Results come back in the same order as the input URLs.
for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"FAILED: {url}: {result}")
    else:
        print(f"OK: {url}: {len(result.content or '')} chars")

Error handling

scrape_many() never raises. Failed requests return the Exception object in the results list. Always check with isinstance(result, Exception).
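Because failures come back as values rather than raised exceptions, it can be convenient to split the results into successes and failures before processing them. The following is a minimal sketch that relies only on the contract described above (results align with the input URLs, and a failed request is an Exception instance). `partition_results` is a hypothetical helper, not part of the SDK, and the demo data stands in for real responses.

```python
def partition_results(urls, results):
    """Split scrape_many() output into successes and failures.

    Hypothetical helper, not part of the SDK. It assumes only that
    results are in the same order as urls and that a failed request
    is represented by an Exception instance.
    """
    ok, failed = [], []
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            failed.append((url, result))
        else:
            ok.append((url, result))
    return ok, failed


# Stand-in data so the sketch runs without network access.
demo_urls = ["https://example.com/a", "https://example.com/b"]
demo_results = ["<page a>", TimeoutError("timed out")]

ok, failed = partition_results(demo_urls, demo_results)
retry_urls = [url for url, _ in failed]
```

The `retry_urls` list could then be passed to a second scrape_many() call if a retry is appropriate for your workload.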

Concurrency

By default, concurrency is auto-detected from your plan:

| Plan    | Max concurrent |
|---------|----------------|
| Free    | 5              |
| Starter | 10             |
| Growth  | 20             |
| Pro     | 50             |
| Scale   | 100            |

Override with max_concurrent:

# Force 10 concurrent requests
results = client.scrape_many(urls, max_concurrent=10, browser=True)
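To make the effect of max_concurrent concrete, the general pattern of bounded client-side concurrency with threads can be sketched with the standard library. This illustrates the idea only and is not the SDK's actual implementation; `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_concurrent=5):
    """Call fetch(url) for every URL using at most max_concurrent threads.

    Results are returned in input order. An exception raised by fetch
    is returned in place of the result instead of propagating.
    """
    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:
            return exc

    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # Executor.map yields results in the order of the input iterable.
        return list(pool.map(safe_fetch, urls))


def fake_fetch(url):
    # Stand-in for a real request so the sketch runs offline.
    if url.endswith("/bad"):
        raise ValueError(f"cannot fetch {url}")
    return f"content of {url}"


demo_results = fetch_all(
    ["https://example.com/1", "https://example.com/bad"],
    fake_fetch,
    max_concurrent=2,
)
```

Raising max_workers lets more requests run at once, which is why the plan's concurrency limit and rate limit cap the useful value.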

With actions, browser + blocking

All scrape() parameters apply to every URL — including actions:

from scrapingpros import WaitForSelectorAction, EvaluateAction, ClickAction

results = client.scrape_many(
    urls,
    format="markdown",
    browser=True,
    actions=[
        WaitForSelectorAction(selector="css:.content", time=5000),
        ClickAction(selector="css:.load-more", optional=True),
        EvaluateAction(script="document.querySelectorAll('.item').length"),
    ],
    headers={"Accept-Language": "en-US"},
    block_resources=["image", "font", "media"],
    block_requests=["google-analytics", "doubleclick"],
    retry_on_block=True,
)

Performance

Benchmark on 50 easy/medium URLs with browser=True:

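| Mode                           | Time    | Speedup |
|--------------------------------|---------|---------|
| Sequential loop                | ~25 min | 1x      |
| scrape_many(max_concurrent=5)  | ~5 min  | 5x      |
| scrape_many(max_concurrent=25) | ~1 min  | 25x     |
| scrape_many(max_concurrent=50) | ~30 s   | 50x     |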

Actual speedup depends on your plan's rate limit and the target sites' response times.
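The benchmark figures are consistent with a simple idealized model in which URLs are processed in full "waves" of max_concurrent requests. The sketch below assumes every request takes the same time and ignores rate limits, retries, and overhead; the 30 s per-URL figure is inferred from the sequential row (~25 min for 50 URLs), and `estimate_seconds` is a hypothetical helper.

```python
import math


def estimate_seconds(n_urls, per_url_seconds, max_concurrent):
    """Idealized wall-clock estimate for a concurrent scrape.

    Assumes uniform request time and no rate limiting: the run takes
    ceil(n_urls / max_concurrent) waves of per_url_seconds each.
    """
    waves = math.ceil(n_urls / max_concurrent)
    return waves * per_url_seconds


# 50 URLs at an assumed 30 s each, for the concurrency levels in the table.
estimates = {c: estimate_seconds(50, 30, c) for c in (1, 5, 25, 50)}
print(estimates)
```

Under these assumptions the model gives 1500 s, 300 s, 60 s, and 30 s, matching the ~25 min, ~5 min, ~1 min, and ~30 s rows. Real runs will deviate when response times vary between sites.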

Async version

import asyncio

from scrapingpros import AsyncScrapingPros

async def main():
    # async with must run inside a coroutine
    async with AsyncScrapingPros("your_token") as client:
        results = await client.scrape_many(urls, format="markdown", browser=True)

asyncio.run(main())

Heterogeneous requests

When each URL needs different parameters (different actions, headers, etc.), use requests=:

from scrapingpros import ScrapeRequest

results = client.scrape_many(
    requests=[
        ScrapeRequest(url="https://site1.com", browser=True, actions=[...]),
        ScrapeRequest(url="https://site2.com", format="markdown", extract={"title": "css:h1"}),
        {"url": "https://site3.com", "browser": True},  # dicts work too
    ],
    max_concurrent=5,
)
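When most URLs share the same settings and only a few need overrides, the request list can be generated rather than written by hand. This is a minimal sketch that builds plain dicts, which the example above shows are accepted by requests=. `build_requests` is a hypothetical helper, and it uses only the keys that appear in this page (url, browser, format).

```python
def build_requests(defaults, per_url):
    """Merge shared defaults with per-URL overrides into request dicts.

    Hypothetical helper, not part of the SDK. Overrides win over
    defaults when both set the same key.
    """
    request_dicts = []
    for url, overrides in per_url.items():
        request_dicts.append({"url": url, **defaults, **overrides})
    return request_dicts


request_dicts = build_requests(
    defaults={"format": "markdown"},
    per_url={
        "https://site1.com": {"browser": True},
        "https://site2.com": {},
    },
)
# The list could then be passed as client.scrape_many(requests=request_dicts).
```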

When to use scrape_many vs collections

|                | scrape_many()              | Collections               |
|----------------|----------------------------|---------------------------|
| Best for       | Up to ~500 URLs            | 1,000+ URLs               |
| Execution      | Client-side concurrency    | Server-side execution     |
| Results        | Immediate, in memory       | Poll or webhook           |
| Parameters     | Same or different per URL  | Different params per URL  |
| Error handling | Exception in results list  | Failed jobs in run status |

For large batches with per-URL parameters, see Batch Processing.